Bandwidth-Aware Resource Management for Extreme Scale Systems
Abstract
As systems scale towards exascale, many resources will become increasingly constrained. While some of these resources have historically been explicitly allocated, many, like network bandwidth, I/O bandwidth, or power, have not. As systems continue to evolve, we expect many such resources to become explicitly managed. This change will pose critical challenges to resource management and job scheduling. In this paper, we explore bandwidth-aware resource management for Blue Gene systems, where the partition-based interconnect architecture provides a unique opportunity to explicitly allocate bandwidth to jobs. We investigate the value of bandwidth awareness and present a bandwidth-aware resource management design for these systems.
Similar resources
Towards Next Generation Resource Management at Extreme-Scales
With the exponential growth of distributed systems in both FLOPS and parallelism (number of cores/threads), scientific applications are growing more diverse with various workloads. These workloads include traditional large-scale high performance computing (HPC) MPI jobs, and HPC ensemble workloads that support the investigation of parameter sweeps using many small-scale coordinated jobs, as wel...
Jointly power and bandwidth allocation for a heterogeneous satellite network
Due to lack of resources such as transmission power and bandwidth in satellite systems, resource allocation problem is a very important challenge. Nowadays, new heterogeneous network includes one or more satellites besides terrestrial infrastructure, so that it is considered that each satellite has multi-beam to increase capacity. This type of structure is suitable for a new generation of commu...
Towards Measuring the Project Management Process During Large Scale Software System Implementation Phase
Project management is an important factor to accomplish the decision to implement large-scale software systems (LSS) in a successful manner. The effective project management comes into play to plan, coordinate and control such a complex project. Project management factor has been argued as one of the important Critical Success Factor (CSF), which need to be measured and monitored carefully duri...
A characterization of workflow management systems for extreme-scale applications
Automation of the execution of computational tasks is at the heart of improving scientific productivity. Over the last years, scientific workflows have been established as an important abstraction that captures data processing and computation of large and complex scientific applications. By allowing scientists to model and express entire data processing steps and their dependencies, workflow ma...
I/O-aware bandwidth allocation for petascale computing systems
In the Big Data era, the gap between the storage performance and an application’s I/O requirement is increasing. I/O congestion caused by concurrent storage accesses from multiple applications is inevitable and severely harms the performance. Conventional approaches either focus on optimizing an application’s access pattern individually or handle I/O requests on a low-level storage layer withou...